Sentiment Analysis of Short Informal Texts
Abstract
We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. The system ranked first in the SemEval-2013 shared task ‘Sentiment Analysis in Twitter’ (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.
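To make the lexicon-generation idea concrete, here is a minimal sketch of how a word-sentiment lexicon with separate entries for negated words might be built from automatically labeled tweets. The abstract does not give the scoring formula, so several things here are assumptions: the score is taken to be the difference of pointwise mutual information with the positive and negative classes, the `_NEG` suffix and the small `NEGATORS` list are illustrative, the add-one smoothing is a convenience, and the labels are assumed to have been derived upstream from sentiment-word hashtags or emoticons. The function names `mark_negation` and `build_lexicon` are hypothetical.

```python
import math
from collections import Counter

# Illustrative negators and clause-ending punctuation (assumptions, not from the paper).
NEGATORS = {"not", "no", "never", "cannot"}
PUNCT = {".", ",", "!", "?", ";", ":"}


def mark_negation(tokens):
    """Append _NEG to every token between a negator and the next punctuation mark."""
    out, negated = [], False
    for tok in tokens:
        if tok in PUNCT:
            negated = False
            out.append(tok)
        elif tok in NEGATORS or tok.endswith("n't"):
            negated = True
            out.append(tok)
        else:
            out.append(tok + "_NEG" if negated else tok)
    return out


def build_lexicon(labeled_tweets, smoothing=1.0):
    """Build {term: score} from (tokens, label) pairs, label in {"positive", "negative"}.

    score = PMI(term, positive) - PMI(term, negative), which simplifies to
    log2(freq(term, pos) * N_neg / (freq(term, neg) * N_pos)).
    Negated occurrences get their own entries (e.g. "good_NEG").
    """
    counts = {"positive": Counter(), "negative": Counter()}
    for tokens, label in labeled_tweets:
        counts[label].update(t for t in mark_negation(tokens) if t not in PUNCT)

    # Smoothed token totals per class (smoothing also avoids division by zero).
    total_pos = sum(counts["positive"].values()) + smoothing
    total_neg = sum(counts["negative"].values()) + smoothing

    lexicon = {}
    for term in set(counts["positive"]) | set(counts["negative"]):
        pos = counts["positive"][term] + smoothing
        neg = counts["negative"][term] + smoothing
        lexicon[term] = math.log2((pos * total_neg) / (neg * total_pos))
    return lexicon


# Toy data; in practice the labels would come from hashtags or emoticons.
toy_tweets = [
    ("i love this phone".split(), "positive"),
    ("not good at all".split(), "negative"),
    ("good day".split(), "positive"),
    ("i hate mondays".split(), "negative"),
]
toy_lexicon = build_lexicon(toy_tweets)
```

On this toy data, `good` receives a positive score while `good_NEG` receives a negative one, which is the effect a separate negated-context lexicon is meant to capture. In a full system such scores would be turned into features for the supervised classifier.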
Similar resources
Sentiment analysis methods in Persian text: A survey
With the explosive growth of social media such as Twitter, reviews on e-commerce websites, and comments on news websites, individuals and organizations are increasingly using opinions in these media for their decision making. Sentiment analysis is one of the techniques used to analyze users' opinions in recent years. The Persian language has specific features and thereby requires unique meth...
On Evaluating the Contribution of Text Normalisation Techniques to Sentiment Analysis on Informal Web 2.0 Texts / Evaluación de la Contribución de la Normalización al Análisis de Sentimiento en Textos Informales de la Web 2.0
The writing style used in social media usually contains informal elements that can lower the performance of Natural Language Processing applications. For this reason, text normalisation techniques have drawn a lot of attention recently when dealing with informal content. However, not all the texts present the same level of informality and may not require additional pre-processing steps. Therefo...
Opinion Mining on Non-English Short Text
As the type and the number of such venues increase, automated analysis of sentiment on textual resources has become an essential data mining task. In this paper, we investigate the problem of mining opinions by extracting aspects of entities on the collection of informal short texts. Both positive and negative sentiment strength of texts are detected. We focus on a non-English language that has...
Comparison of Short-Text Sentiment Analysis Methods for Croatian
We focus on the task of supervised sentiment classification of short and informal texts in Croatian, using two simple yet effective methods: word embeddings and string kernels. We investigate whether word embeddings offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. We conduct a comparison on three different datasets, using diff...
LyS: Porting a Twitter Sentiment Analysis Approach from Spanish to English
This paper proposes an approach to solve message- and phrase-level polarity classification in Twitter, derived from an existing system designed for Spanish. As a first step, an ad-hoc preprocessing is performed. We then identify lexical, psychological and semantic features in order to capture different dimensions of the human language which are helpful to detect sentiment. These features are used...
Journal:
- J. Artif. Intell. Res.
Volume: 50
Publication date: 2014